30 research outputs found

BiplotGUI: Interactive Biplots in R

Author: Anthony la Grange
Niël le Roux
Sugnet Gardner-Lubbe

Biplots simultaneously provide information on both the samples and the variables of a data matrix in two- or three-dimensional representations. The BiplotGUI package provides a graphical user interface for the construction of, interaction with, and manipulation of biplots in R. The samples are represented as points, with coordinates determined either by the choice of biplot, principal coordinate analysis or multidimensional scaling. Various transformations and dissimilarity metrics are available. Information on the original variables is incorporated by linear or non-linear calibrated axes. Goodness-of-fit measures are provided. Additional descriptors can be superimposed, including convex hulls, alpha-bags, point densities and classification regions. Amongst the interactive features are dynamic variable value prediction, zooming and point and axis drag-and-drop. Output can easily be exported to the R workspace for further manipulation. Three-dimensional biplots are incorporated via the rgl package. The user requires almost no knowledge of R syntax.

Research Papers in Economics

BiplotGUI: Interactive Biplots in R

Author: Anthony la Grange
Niel le Roux
Sugnet Gardner-Lubbe
Publication venue: Foundation for Open Access Statistics
Publication date: 01/06/2009

Biplots simultaneously provide information on both the samples and the variables of a data matrix in two- or three-dimensional representations. The BiplotGUI package provides a graphical user interface for the construction of, interaction with, and manipulation of biplots in R. The samples are represented as points, with coordinates determined either by the choice of biplot, principal coordinate analysis or multidimensional scaling. Various transformations and dissimilarity metrics are available. Information on the original variables is incorporated by linear or non-linear calibrated axes. Goodness-of-fit measures are provided. Additional descriptors can be superimposed, including convex hulls, alpha-bags, point densities and classification regions. Amongst the interactive features are dynamic variable value prediction, zooming and point and axis drag-and-drop. Output can easily be exported to the R workspace for further manipulation. Three-dimensional biplots are incorporated via the rgl package. The user requires almost no knowledge of R syntax.

Directory of Open Access Journals

Journal of Statistical Software

Visualisation of quadratic discriminant analysis and its application in exploration of microbial interactions

Author: Dube Felix S
Gardner-Lubbe Sugnet
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/02/2015

Background: When comparing diseased and non-diseased patients in order to discriminate between the aspects associated with the specific disease, it is often observed that the diseased patients have more variability than the non-diseased patients. In such cases quadratic discriminant analysis, which is based on the estimation of different covariance structures for the different groups, is required. Having different covariance matrices means the canonical variate transformation cannot be used to obtain a visual representation of the discrimination and group separation. Results: In this paper an alternative method is proposed: combining the different transformations for the different groups into a single representation of the sample points with classification regions. In order to associate the differences in variables with group discrimination, a biplot is produced which includes information on the variables, samples and their relationship.

Cape Town University OpenUCT

Springer - Publisher Connector

PubMed Central

Visualising Incomplete Data with Subset Multiple Correspondence Analysis

Author: Gardner-Lubbe Sugnet
Nienkemper-Swanepoel Johané
Roux Niël J. le
Publication date: 20/05/2021

Determining the cause of missing values is a challenge, but an important task in order to select correct analysis techniques for missing data. This paper presents a new approach to identify the missing data mechanism (MDM) by applying cluster analysis to biplots of data having missing observations. Subset multiple correspondence analysis (sMCA) enables an isolated analysis of a chosen subset while preserving the scaffolding of the original data set. Multivariate categorical data sets are frequently represented in a coded dummy matrix, referred to as an indicator matrix. Additional category levels can be created for the indicator matrix to account for the unobserved information, which has the advantage of not forfeiting any observed information. The extended indicator matrix easily partitions a data set into observed and unobserved subsets. sMCA biplots are used for the visual exploration of the subsets. Configurations of the incomplete subsets enable the recognition of non-response patterns which could aid in the identification of a particular MDM. The missing at random (MAR) MDM refers to missing responses that are dependent on the observed information and is expected to be identified by patterns and groupings occurring in the incomplete sMCA biplot. The missing completely at random (MCAR) MDM states that all observations have the same probability of not being captured, which could be identified by a random cloud of points in the incomplete sMCA biplot. The partitioning around medoids (pam) clustering technique is used to establish the number of available clusters in an incomplete sMCA biplot. A simulation study confirmed that there is a difference in the number of sufficient clusters that can be identified from MAR and MCAR simulated data sets. A real data set is also explored and the MDM is identified using the results of the simulation study as guidelines.

KITopen

Succession and determinants of the early life nasopharyngeal microbiota in a South African birth cohort

Author: Claassen-Weitz Shantelle
Gardner-Lubbe Sugnet
Mounaud Stephanie H.
Mwaikono Kilaza S.
Nicol Mark P.
Nierman William C.
Workman Lesley
Xia Yao
Zar Heather J.
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/12/2023

Background: Bacteria colonizing the nasopharynx play a key role as gatekeepers of respiratory health. Yet, dynamics of early life nasopharyngeal (NP) bacterial profiles remain understudied in low- and middle-income countries (LMICs), where children have a high prevalence of risk factors for lower respiratory tract infection. We investigated longitudinal changes in NP bacterial profiles, and associated exposures, among healthy infants from low-income households in South Africa. Methods: We used short fragment (V4 region) 16S rRNA gene amplicon sequencing to characterize NP bacterial profiles from 103 infants in a South African birth cohort, at monthly intervals from birth through the first 12 months of life and six-monthly thereafter until 30 months. Results: Corynebacterium and Staphylococcus were dominant colonizers at 1 month of life; however, these were rapidly replaced by Moraxella- or Haemophilus-dominated profiles by 4 months. This succession was almost universal and largely independent of a broad range of exposures. Warm weather (summer), lower gestational age, maternal smoking, no day-care attendance, antibiotic exposure, or low height-for-age z score at 12 months were associated with higher alpha and beta diversity. Summer was also associated with higher relative abundances of Staphylococcus, Streptococcus, Neisseria, or anaerobic gram-negative bacteria, whilst spring and winter were associated with higher relative abundances of Haemophilus or Corynebacterium, respectively. Maternal smoking was associated with higher relative abundances of Porphyromonas. Antibiotic therapy (or isoniazid prophylaxis for tuberculosis) was associated with higher relative abundance of anaerobic taxa (Porphyromonas, Fusobacterium, and Prevotella) and with lower relative abundances of health-associated taxa Corynebacterium and Dolosigranulum. HIV-exposure was associated with higher relative abundances of Klebsiella or Veillonella and lower relative abundances of an unclassified genus within the family Lachnospiraceae. Conclusions: In this intensively sampled cohort, there was rapid and predictable replacement of early profiles dominated by health-associated Corynebacterium and Dolosigranulum with those dominated by Moraxella and Haemophilus, independent of exposures. Season and antibiotic exposure were key determinants of NP bacterial profiles. Understudied but highly prevalent exposures in LMICs, including maternal smoking and HIV-exposure, were associated with NP bacterial profiles.

Research Online @ ECU

Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens

Author: Claassen-Weitz Shantelle
du Toit Elloise
Gardner-Lubbe Sugnet
Mwaikono Kilaza S
Nicol Mark P
Zar Heather J
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/05/2020

Careful consideration of experimental artefacts is required in order to successfully apply high-throughput 16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental design, quality control and “denoising” approaches for sequencing low biomass specimens. Results: We found that bacterial biomass is a key driver of 16S rRNA gene sequencing profiles generated from bacterial mock communities and that the use of different deoxyribonucleic acid (DNA) extraction methods [DSP Virus/Pathogen Mini Kit® (Kit-QS) and ZymoBIOMICS DNA Miniprep Kit (Kit-ZB)] and storage buffers [PrimeStore® Molecular Transport medium (Primestore) and Skim-milk, Tryptone, Glucose and Glycerol (STGG)] further influence these profiles. Kit-QS better represented hard-to-lyse bacteria from bacterial mock communities compared to Kit-ZB. Primestore storage buffer yielded lower levels of background operational taxonomic units (OTUs) from low biomass bacterial mock community controls compared to STGG. In addition to bacterial mock community controls, we used technical repeats (nasopharyngeal and induced sputum processed in duplicate, triplicate or quadruplicate) to further evaluate the effect of specimen biomass and participant age at specimen collection on resultant sequencing profiles. We observed a positive correlation (r = 0.16) between specimen biomass and participant age at specimen collection: low biomass technical repeats (represented by < 500 16S rRNA gene copies/μl) were primarily collected at < 14 days of age. We found that low biomass technical repeats also produced higher alpha diversities (r = −0.28); 16S rRNA gene profiles similar to no template controls (Primestore); and reduced sequencing reproducibility. Finally, we show that the use of statistical tools for in silico contaminant identification, as implemented through the decontam package in R, provides better representations of indigenous bacteria following decontamination. Conclusions: We provide insight into experimental design, quality control steps and “denoising” approaches for 16S rRNA gene high-throughput sequencing of low biomass specimens. We highlight the need for careful assessment of DNA extraction methods and storage buffers; sequence quality and reproducibility; and in silico identification of contaminant profiles in order to avoid spurious results.

Cape Town University OpenUCT

Stellenbosch University SUNScholar Repository

Visualisation of quadratic discriminant analysis and its application in exploration of microbial interactions

Author: CT Braak
D Moore
Felix S Dube
H Hotelling
JC Gower
MJ Greenacre
R Fisher
Sugnet Gardner-Lubbe
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Multivariate data analysis identifies natural clusters of Tuberous Sclerosis Complex Associated Neuropsychiatric Disorders (TAND)

Author: de Vries Petrus J.
De Waele Liesbeth
Gardner-Lubbe Sugnet
Jansen Anna
Krueger Darcy
Leclezio Loren
Sahin Mustafa
Sparagana Steven
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2021

Background: Tuberous Sclerosis Complex (TSC), a multi-system genetic disorder, is associated with a wide range of TSC-Associated Neuropsychiatric Disorders (TAND). Individuals have apparently unique TAND profiles, challenging diagnosis, psycho-education, and intervention planning. We proposed that identification of natural TAND clusters could lead to personalized identification and treatment of TAND. Two small-scale studies showed cluster and factor analysis could identify clinically meaningful natural TAND clusters. Here we set out to identify definitive natural TAND clusters in a large, international dataset. Method: Cross-sectional, anonymized TAND Checklist data of 453 individuals with TSC were collected from six international sites. Data-driven methods were used to identify natural TAND clusters. Mean squared contingency coefficients were calculated to produce a correlation matrix, and various cluster analyses and exploratory factor analysis were examined. Statistical robustness of clusters was evaluated with 1000-fold bootstrapping, and internal consistency calculated with Cronbach’s alpha. Results: Ward’s method rendered seven natural TAND clusters with good robustness on bootstrapping. Cluster analysis showed significant convergence with an exploratory factor analysis solution, and, with the exception of one cluster, internal consistency of the emerging clusters was good to excellent. Clusters showed good clinical face validity. Conclusions: Our findings identified a data-driven set of natural TAND clusters from within highly variable TAND Checklist data. The seven natural TAND clusters could be used to train families and professionals and to develop tailored approaches to identification and treatment of TAND. Natural TAND clusters may also have differential aetiological underpinnings and responses to molecular and other treatments.

Cape Town University OpenUCT

Directory of Open Access Journals
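
The TAND abstract above reports internal consistency measured with Cronbach's alpha. As a rough illustration only, the standard textbook formula can be sketched in Python as follows. The function name `cronbach_alpha`, the row-per-respondent layout, and the toy scores are assumptions made for this example; they are not taken from the study or its data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent rows (one score per item).

    Standard formula: alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)),
    using sample variances. The data layout is an assumption of this sketch.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy scores: three respondents, two items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Values closer to 1 indicate that the items in a cluster vary together; the thresholds for "good" or "excellent" used in the study are not stated in the abstract, so none are encoded here.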

Longitudinal Population Dynamics of Staphylococcus aureus in the Nasopharynx During the First Year of Life

Author: Felix Dube
Heather J. Zar
Jordache Ramjith
Lourens Robberts
Mark P. Nicol
Polite M. Nduru
Shima M. Abdulgader
Sugnet Gardner-Lubbe
Publication venue: 'Frontiers Media SA'
Publication date: 01/03/2019

Background: Staphylococcus aureus colonization is a risk factor for invasive disease. Few studies have used strain genotype data to study S. aureus acquisition and carriage patterns. We investigated S. aureus nasopharyngeal carriage in infants in an intensively sampled South African birth cohort. Methods: Nasopharyngeal swabs were collected at birth and fortnightly from 137 infants through their first year of life. S. aureus was characterized by spa-typing. The incidence of S. aureus acquisition, and median carriage duration for each genotype was determined. S. aureus carriage patterns were defined by combining the carrier index (proportion of samples testing positive for S. aureus) with genotype diversity measures. Persistent or prolonged carriage were defined by a carrier index ≥0.8 or ≥0.5, respectively. Risk factors for time to acquisition of S. aureus were determined. Results: Eighty-eight percent (121/137) of infants acquired S. aureus at least once. The incidence of acquisition at the species and genotype level was 1.83 and 2.8 episodes per child-year, respectively. No children had persistent carriage (defined as carrier index of >0.8). At the species level 6% had prolonged carriage, while only 2% had prolonged carriage with the same genotype. Carrier index correlated with the absolute number of spa-CCs carried by each infant (r = 0.5; 95% CI 0.35–0.62). Time to first acquisition of S. aureus was shorter in children from households with ≥5 individuals (HR 1.06, 95% CI 1.07–1.43), with S. aureus carrier mothers (HR 1.5, 95% CI 1.2–2.47), or with a positive tuberculin skin test during the first year of life (HR 1.81, 95% CI 0.97–3.3). Conclusion: Using measures of genotype diversity, we showed that S. aureus NP carriage is highly dynamic in infants. Prolonged carriage with a single strain occurred rarely; persistent carriage was not observed. A correlation was observed between carrier index and genotype diversity.

Directory of Open Access Journals
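
The S. aureus abstract above defines the carrier index as the proportion of samples testing positive, with persistent and prolonged carriage at thresholds of ≥0.8 and ≥0.5. A minimal Python sketch of that calculation is given below. The function names, the boolean-list input, and the "other" label for indices below 0.5 are assumptions for illustration, and the sketch ignores the genotype diversity measures that the study combined with the index.

```python
def carrier_index(swab_results):
    """Proportion of swabs testing positive (True) for one infant."""
    if not swab_results:
        raise ValueError("at least one swab result is required")
    positives = sum(1 for result in swab_results if result)
    return positives / len(swab_results)


def carriage_pattern(index, persistent_cutoff=0.8, prolonged_cutoff=0.5):
    """Classify a carrier index using the thresholds quoted in the abstract.

    The label "other" for indices below the prolonged cutoff is a
    placeholder of this sketch, not a category named in the abstract.
    """
    if index >= persistent_cutoff:
        return "persistent"
    if index >= prolonged_cutoff:
        return "prolonged"
    return "other"


# Hypothetical infant with 26 fortnightly swabs, 13 of them positive.
example_swabs = [True] * 13 + [False] * 13
example_index = carrier_index(example_swabs)
example_pattern = carriage_pattern(example_index)
```

Keeping the cutoffs as parameters makes it easy to compare the ≥0.8 definition with the >0.8 wording that also appears in the abstract.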

Development and feasibility of the self-report quantified TSC-Associated Neuropsychiatric Disorders Checklist (TAND-SQ)

Author: Bissell Stacey
Byars Anna W.
Capal Jamie K.
Chambers Nola
Cukier Sebastián
Davis Peter E.
de Vries Magdalena C.
de Vries Petrus J.
De Waele Liesbeth
Flinn Jennifer
Gardner-Lubbe Sugnet
Gipson Tanjala
Heunis Tosca-Marie
Jansen Anna C.
Kingswood J. Christopher
Krueger Darcy A.
Kumm Aubrey J.
Sahin Mustafa
Schoeters Eva
Smith Catherine
Srivastava Shoba
Takei Megumi
van Eeghen Agnies M.
Vanclooster Stephanie
Waltereit Robert
Publication venue: 'Elsevier BV'
Publication date: 07/07/2023

University of Birmingham Research Portal